Conversation

@Baoyuantop Baoyuantop commented Nov 6, 2025

Description

After investigation, the problem described in the original issue also exists in nacos service discovery. Currently, if multiple nacos nodes are configured, each request randomly selects one of them; if that node fails, the request cannot be retried until the next synchronization.

This PR improves this behavior: when the request to one node fails, the request is resent to the remaining nodes.
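The retry behavior described above can be sketched in language-agnostic form. The actual patch is Lua inside APISIX's nacos discovery module; the Python below is only an illustration, and the `hosts` list and `fetch` callable are hypothetical stand-ins, not APISIX APIs:

```python
import random

def fetch_with_failover(hosts, fetch):
    """Try a randomly chosen nacos host first; on failure, fall back to the rest.

    `hosts` is a list of base URIs and `fetch` is a callable returning
    (ok, err) -- both are illustrative assumptions, not the real module's API.
    """
    if not hosts:
        return False, "no nacos host configured"
    # Start at a random index so load is still spread across nodes.
    start = random.randrange(len(hosts))
    last_err = None
    for i in range(len(hosts)):
        host = hosts[(start + i) % len(hosts)]
        ok, err = fetch(host)
        if ok:
            return True, None
        last_err = err  # remember the most recent failure to report
    return False, last_err
```

The key point is the wrap-around loop: the previous behavior stopped after the single random pick, while this version only gives up once every configured node has failed.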

Which issue(s) this PR fixes:

Fixes #12610

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

@Baoyuantop Baoyuantop marked this pull request as ready for review November 7, 2025 00:55
@dosubot dosubot bot added the `size:L` (This PR changes 100-499 lines, ignoring generated files.) label Nov 7, 2025
@dosubot dosubot bot added the `bug` (Something isn't working) label Nov 7, 2025
@Baoyuantop Baoyuantop requested a review from membphis November 21, 2025 01:11
else
local ok, err = fetch_from_host(base_uri, username, password, infos)
if ok then
return
Member

I think we can remember the index of the healthy host in a variable in this Lua module, so that we can choose that healthy nacos host directly in the next sync.

Contributor Author


My idea is that the healthy and unhealthy states are unreliable without health checks, since they can change at any time. Therefore, I don't see the need to store these states.

The current implementation is only a minor improvement; adding health checks will completely eliminate this problem.


Development

Successfully merging this pull request may close these issues.

bug: dynamic upstream, one Eureka node is unavailable, half of the requests are lost after reloading

2 participants